Structural, Transitive and Latent Models for Biographic Fact Extraction

Authors

  • Nikesh Garera
  • David Yarowsky
Abstract

This paper presents six novel approaches to biographic fact extraction that model structural, transitive and latent properties of biographical data. The ensemble of these proposed models substantially outperforms standard pattern-based biographic fact extraction methods and performance is further improved by modeling inter-attribute correlations and distributions over functions of attributes, achieving an average extraction accuracy of 80% over seven types of biographic attributes.

Similar resources

Multi-Field Information Extraction and Cross-Document Fusion

In this paper, we examine the task of extracting a set of biographic facts about target individuals from a collection of Web pages. We automatically annotate training text with positive and negative examples of fact extractions and train Rote, Naïve Bayes, and Conditional Random Field extraction models for fact extraction from individual Web pages. We then propose and evaluate methods for fusin...

Structural Equation Modeling and Its Application in Psychological Studies: A Review Study

Introduction: Structural Equation Modeling (SEM) is a very general statistical modeling technique, which is widely used in the behavioral sciences. It can be viewed as a combination of path analysis, regression and factor analysis. One of the prominent features of this method is the ability to compute direct, indirect and total effects, as well as latent variable modeling. Methods: This sy...

Modeling Latent Biographic Attributes in Conversational Genres

This paper presents and evaluates several original techniques for the latent classification of biographic attributes such as gender, age and native language, in diverse genres (conversation transcripts, email) and languages (Arabic, English). First, we present a novel partner-sensitive model for extracting biographic attributes in conversations, given the differences in lexical usage and discou...

The Role of Parental Education and Intermediary Determinants in Children's Health in Iran

Introduction: Using national data, this study assessed the relationship between parents' education and children's health in Iran and examined the role of intermediary variables. Method: In this ecological study, we collected national data on parental education as predictive variables, children's health as the response variable, and housing condition, child's benefit from health ...

Fuzzy number-valued fuzzy relation

It is a well-known fact that binary relations are generalized mathematical functions. Unlike functions from a domain to a range, binary relations may assign two or more elements of the range to each element of the domain. Some basic operations on functions, such as the inverse and composition, are applicable to binary relations as well. Depending on the domain or range or both are fuzzy value fuzzy set, i...

Publication date: 2009